In [1]:
%%HTML
<style>
.container { width:100% }
</style>
This short notebook demonstrates that numerical computations on NumPy arrays are much faster than the same computations on Python lists.
In [2]:
import numpy as np
We begin by defining two NumPy arrays a and b, each filled with a million random numbers drawn uniformly from the interval $[0, 1)$.
In [3]:
a = np.random.rand(1000000)
b = np.random.rand(1000000)
Next, we compute the dot product of a and b. Mathematically, this is defined as follows:
$$ \textbf{a} \cdot \textbf{b} = \sum\limits_{i=1}^n \textbf{a}[i] \cdot \textbf{b}[i], $$
where $n$ is the common length of a and b. In Python, we can use the operator @ to compute the dot product of two one-dimensional NumPy arrays.
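To make the definition concrete, here is a minimal sketch on two three-element vectors. The names u and v and their values are chosen only for illustration and do not appear elsewhere in this notebook. Note that Python indices run from 0 to n-1, whereas the formula counts from 1 to n.

```python
import numpy as np

# Small illustrative vectors (hypothetical values, not from the notebook).
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# Explicit sum following the definition, with 0-based indices.
explicit = 0.0
for i in range(len(u)):
    explicit += u[i] * v[i]

# The same value via the @ operator and via np.dot.
with_operator = u @ v
with_dot = np.dot(u, v)

# All three are 1*4 + 2*5 + 3*6 = 32.
print(explicit, with_operator, with_dot)
```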
In [4]:
%%time
a @ b
Out[4]:
To compare this with the time needed when a and b are stored as lists instead, we convert both arrays to ordinary Python lists.
In [5]:
la = list(a)
lb = list(b)
Next, we compute the dot product of a and b using these lists.
In [6]:
%%time
total = 0  # named total so that the built-in sum() is not shadowed
for i in range(len(la)):
    total += la[i] * lb[i]
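The %%time magic only works inside IPython or Jupyter. As a rough sketch of the same comparison in a plain Python script, the block below times both variants with time.perf_counter and also checks that they produce the same value. The names x, y, lx, and ly, the fixed seed, and the reduced size of 100,000 elements are assumptions made here to keep the example short and reproducible.

```python
import time
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility
x = rng.random(100_000)          # smaller than the notebook's arrays
y = rng.random(100_000)
lx, ly = list(x), list(y)

# Time the NumPy dot product.
start = time.perf_counter()
numpy_result = x @ y
numpy_seconds = time.perf_counter() - start

# Time the explicit loop over the two lists.
start = time.perf_counter()
loop_result = 0.0
for i in range(len(lx)):
    loop_result += lx[i] * ly[i]
loop_seconds = time.perf_counter() - start

# The results agree up to floating-point rounding, since the summation order differs.
print(np.isclose(numpy_result, loop_result))
print(f"NumPy: {numpy_seconds:.6f} s, loop: {loop_seconds:.6f} s")
```

The measured times depend on the machine and on the NumPy build, so a single run should be read as an indication rather than a benchmark.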
We notice that the NumPy-based computation is much faster than the list-based computation. Similar observations can be made when a function is applied to every element of an array. For big arrays, using the vectorized functions offered by NumPy is usually much faster than applying the function to each element of a list in a Python loop.
In [7]:
import math
In [8]:
%%time
for i, x in enumerate(la):
    lb[i] = math.sin(x)
In [9]:
%%time
b = np.sin(a)
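The two cells above should produce the same values, only at different speeds. The following sketch checks this on a smaller array and times both variants without relying on %%time. The names x, lx, loop_values, and vector_values, the seed, and the array size are assumptions introduced for this illustration.

```python
import math
import time
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility
x = rng.random(100_000)          # smaller than the notebook's arrays
lx = x.tolist()                  # list of native Python floats

# Element-by-element application of math.sin.
start = time.perf_counter()
loop_values = [math.sin(value) for value in lx]
loop_seconds = time.perf_counter() - start

# Vectorized application of np.sin to the whole array.
start = time.perf_counter()
vector_values = np.sin(x)
vector_seconds = time.perf_counter() - start

# Both approaches yield the same values up to floating-point rounding.
print(np.allclose(loop_values, vector_values))
print(f"loop: {loop_seconds:.6f} s, vectorized: {vector_seconds:.6f} s")
```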